Executive Summary



In 2004, Achieve launched a study to compare the graduation exams in six
states: Florida, Maryland, Massachusetts, New Jersey, Ohio and Texas. The
goal of this study was to help answer some basic questions about the
expectations states are setting for their high school graduates through the use
of exit exams: Do the tests reflect material that students should be familiar
with by the time they complete high school? Is it reasonable to expect all
students to pass these tests before they graduate? If they pass these tests,
does it mean students are ready for their next steps in life?

The resulting report (Do High School Graduation Exams Measure Up?)
compared the content and rigor of the exams and the cut scores students need
to achieve to pass the tests. Across the states, we found that the tests do indeed
set a floor for students that can be responsibly defended as a graduation
requirement, but they do not effectively tap the higher-level skills that truly
constitute “readiness” for college and work.

In 2005, Achieve was asked by the Hawaii Department of Education to compare
its 2005 grade 10 Hawaii State Assessment in reading and mathematics with the
six states’ exams employing the same methodology used in the initial study.
Although Hawaii does not require students to pass its grade 10 assessments to
graduate, such a comparison is nonetheless a useful exercise. Because the states
that participated in the larger study together enroll nearly a quarter of the
nation’s high school students, they provide a solid reference point for Hawaii as
it works to strengthen its grade 10 assessments over time.

Specifically, the state asked Achieve to:

■ Analyze the content of the grade 10 assessments in reading and mathematics;

■ Determine how well the assessments measure the skills necessary to succeed
in college and work;

■ Analyze what it takes to pass the grade 10 assessments; and

■ Compare the content and rigor of the assessments to those of other states.

Findings for Hawaii 

After a detailed analysis of Hawaii’s grade 10 assessments in reading and
mathematics, Achieve found significant differences between the Hawaii tests and
those of other states. While the reading test is generally less demanding than
those of other states, the mathematics test contains considerably more
challenging content than tests from other states. Nonetheless, neither assessment
is overly rigorous. Indeed, Hawaii, like other states, will need to develop
assessments for use after grade 10 to help ensure that graduating seniors are
on track for success in postsecondary institutions and today’s workplace.

Achieve’s major findings are as follows:

Reading 

■ Hawaii’s reading test puts a premium on comprehension of informational
text, which is exactly what colleges and employers say is essential for
success in college-level courses across the curriculum and in the
workplace. The reading test underscores the importance of students being able
to comprehend informational text by selecting passages that are mostly
exposition (writing contained in textbooks, articles, reports and manuals),
rather than literary prose and poetry, and developing test questions that
probe students’ ability to decipher informational text. Hawaii’s emphasis is
appropriate. As research by Achieve’s American Diploma Project (ADP)
has demonstrated, whether students are bound for college or the
workplace, they must be able to understand informational text.

■ The passages on the Hawaii reading test tend to be less challenging than 
the reading passages on most of the other tests we analyzed, making the 
Hawaii test among the least rigorous. Hawaii’s reading test overall is 
among the least rigorous of the tests we analyzed, mainly because the
reading passages on the test are of relatively low cognitive complexity,
generally representative of upper middle school and early high school
level reading. Moreover, while Hawaii’s test questions are more cognitively
demanding on average than those in most other states, the questions fall
short in not requiring students to analyze text beyond a superficial level, a
skill critical for success in college and today’s work environment.

■ Hawaii’s “Meets Proficiency” score in reading is comparable to the
average score of the other states’ passing levels. As was true of the other
states in Achieve’s prior study, Hawaii students can pass the reading test
with knowledge and skills that ACT considers more appropriate for the
test it gives to 8th and 9th graders than for its college admissions test.

Mathematics 

■ Hawaii’s test in mathematics is well balanced and contains rigorous
content. The test gives more emphasis to geometry, algebra and data items
than to number items, as is appropriate for a grade 10 test. Moreover, the
content of the test is more rigorous than all but one of the states analyzed
in Achieve’s earlier study due to Hawaii’s relatively large proportion of
“Advanced Algebra” items, bringing the test closer to the demands of
college and work. This is particularly important in light of mounting
evidence that indicates that Algebra II is fast replacing Algebra I as the
gatekeeper course for success in college and the high-skills workplace.

■ Hawaii’s “Meets Proficiency” score in mathematics requires students to know
slightly more challenging content than students who scored at the passing
level on the other tests Achieve analyzed. While the content demand on the
Hawaii assessment in mathematics is higher than those in other states, the
test items themselves are less cognitively demanding on average than those of
the six other states in the study, particularly for items assessing number and
geometry content. Thus, the test as a whole does not present unreasonable
expectations. In fact, from an international perspective, to pass the
mathematics test, Hawaii students need to successfully answer questions that, on
average, cover material students in most other countries study by grade 8.

Recommendations for Improvement 

Achieve’s analysis of six states’ graduation exams indicated that states must
continue to raise the bar on their exit exams over time.

These findings hold true for Hawaii, and we recommend that the state:

■ Raise the overall rigor of the grade 10 reading test. Hawaii should increase
the complexity of the reading passages on its assessment. Some passages on
the reading test should represent the level of demand typical of instructional
materials written for a late high school reading level to raise the ceiling on the
test and signal the level of text students need to comprehend to be on track
for attainment in postsecondary education and the new economy. In addition,
Hawaii should add items that measure the highest level of cognitive demand,
requiring students to more deeply analyze text.

■ Phase in higher cut scores on the reading test over time. In addition to
increasing the cognitive demand of its passages and/or items, Hawaii can raise
the rigor of its reading test over time by raising the score required for passing.
Texas is using this approach with its new graduation exam. This strategy only
works if a test has enough range in what it measures, so that a higher score
actually reflects more advanced knowledge and skills. If a higher cut score
simply means that students must answer more of the same kinds of items
correctly, rather than items tapping more advanced concepts and skills, it is not
very meaningful to raise the cut score.

■ Raise the level of performance demand of the mathematics items. Although
the content on the math test is challenging, the items themselves tend to
make low-level demands in terms of performance. Hawaii should raise the
cognitive demand of its items over time by increasing the proportion of
items that require complex problem-solving skills, problem formulation
and advanced reasoning.

As it continues to raise expectations, Hawaii must continue to invest in
improving teaching and learning by implementing exemplary instructional
materials, by releasing its tests, as Massachusetts does, or by posting
assessment exemplars on its Web site. Hawaii also should develop diagnostic,
formative assessments for classroom use; align professional development
with its tests; and provide extra support to struggling students. Doing so will
undoubtedly result in Hawaii’s producing an ever-increasing number of
proficient students, well prepared for the rigors of postsecondary education and
the realities of a global economy.

Beyond the Grade 10 Assessment 

Tests administered in the 10th grade cannot fully capture the range of
content that students study in high school. Over time, Hawaii will need to go
beyond its grade 10 test to develop a more comprehensive set of
assessments that measure the full set of knowledge and skills that indicate
readiness for college and work. One possible approach is for Hawaii to develop
end-of-course tests for subjects such as Algebra II or upper-level English
that are beyond the scope of its 10th grade tests. Such tests could be
factored into course grades or included on high school transcripts, and they
would provide valuable information that postsecondary institutions and
employers could use in making admissions, placement or hiring decisions.

Finally, as critical as assessments are, they cannot measure everything that
matters to a young person’s education. The abilities to make effective oral
arguments and conduct research projects are considered essential skills by
employers and postsecondary educators alike, but these skills are not well
assessed by a paper-and-pencil test. To ensure that these important skills
are measured, Hawaii will need to work with local districts to establish a
systematic method for evaluating them across the state.

Achieve undertook its original 2004 study to provide educators, policymakers
and the public with a clearer picture of what high school graduation exams
measure and how difficult they are to pass. Although Hawaii does not require
students to pass its grade 10 assessment to graduate, its assessment, like those
of the other six states, sets a reasonable floor for students. Therefore, Achieve
strongly encourages Hawaii not to lower its expectations. Rather, Hawaii should
stay the course and ratchet up the level of demand of these assessments over
time, and it should extend the high school assessment system to measure
college- and work-ready skills.

To help the state accomplish this, Achieve encourages Hawaii to consider
joining its ADP Network, a group of 22 states that have pledged themselves to a
policy agenda in support of truly preparing students for success in college and
work by the time they graduate from high school. Each state has committed to
the following four actions:

■ Aligning high school standards with the knowledge and skills required for
success in college and work;

■ Requiring all high school graduates to take challenging courses that actually
prepare them for life after high school;

■ Streamlining the assessment system so that the tests students take in high
school also can serve as readiness tests for college and work; and

■ Holding high schools accountable for graduating students who are ready for
college or careers, and holding postsecondary institutions accountable for
students’ success once enrolled.

Although the Network has been in existence for just over a year, Achieve has
already seen evidence of substantial progress on the part of participating states.

I. Background



In June 2004, Achieve published an analysis of the graduation exams in six 
states — Florida, Maryland, Massachusetts, New Jersey, Ohio and Texas. The 
study — Do High School Graduation Exams Measure Up? — compared the 
content and rigor of the tests, as well as the scores that students needed to 
pass those tests. 

The Hawaii Department of Education asked Achieve to undertake a study 
that would compare the 2005 grade 10 Hawaii State Assessment in reading 
and mathematics to the six state graduation exams based on the
methodology employed in the larger study. It is important to note that Achieve’s
analysis is not an alignment study of how closely Hawaii’s grade 10
assessment measures its standards. Achieve’s alignment studies take somewhat
different criteria into account and are based on a different methodology. 
Rather, this analysis is intended to explore the content and rigor of the tests 
in comparison to other tests and against common external benchmarks. 

Why Achieve Initiated the Study of Graduation Exams 

High school graduation exams are in place in nearly half the states, and 
more than half the nation’s high school students have to pass them to earn a 
diploma. More rigorous than an earlier generation of minimum competency 
tests initiated in the 1980s, these tests are a significant part of the decade- 
long movement to raise standards and improve achievement in the United 
States. They also have become a lightning rod for public debate. 

The attention exit exams have received is understandable and deserved. 
They are the most public example of states’ holding students directly 
accountable for reaching higher standards. For the most part, however, 
the public debate over high school exit exams has gone on without vital 
information about how high a hurdle they actually present to high school 
students. 

Achieve launched its 2004 study to provide educators, policymakers and 
the public with a clearer picture of what high school graduation exams 
measure and how difficult they are to pass. The states that participated in 
the study together enroll nearly a quarter of the nation’s high school
students, making them an ideal point of comparison for Hawaii.

Achieve’s methodology builds from describing the attributes and dimensions
of individual test items, to grouping the items in meaningful categories, to
identifying patterns and making comparisons among states. To ensure
consistency in the way items are described, Achieve develops coding schemes
for each dimension and trains expert reviewers in their use. To ensure reliability,
two reviewers independently code each individual item and reconcile any
differences in judgment before assigning final characterizations. Categorizing items
allows us to construct an overall representation of each state test and to make
cross-state comparisons.
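
The double-coding step described above can be illustrated with a short sketch. The item identifiers, the code labels and the function name are hypothetical and are not taken from Achieve's coding schemes; the sketch only shows one way two reviewers' independent codes might be compared so that disagreements can be flagged for reconciliation.

```python
from typing import Dict, List


def find_disagreements(reviewer_a: Dict[str, str],
                       reviewer_b: Dict[str, str]) -> List[str]:
    """Return the item IDs that the two reviewers coded differently.

    Items coded by only one reviewer are flagged as well, since they
    cannot receive a final characterization until both codes exist.
    """
    all_items = sorted(set(reviewer_a) | set(reviewer_b))
    return [item for item in all_items
            if reviewer_a.get(item) != reviewer_b.get(item)]


# Hypothetical codes for four test items (labels are illustrative only).
codes_a = {"item1": "Algebra I", "item2": "Prealgebra",
           "item3": "Algebra II", "item4": "Geometry"}
codes_b = {"item1": "Algebra I", "item2": "Algebra I",
           "item3": "Algebra II", "item4": "Geometry"}

# Items on this list would go back to both reviewers for reconciliation.
to_reconcile = find_disagreements(codes_a, codes_b)
print(to_reconcile)
```

In this illustration only the second item is flagged, because it is the only one on which the two hypothetical reviewers differ.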

In general, the dimensions Achieve examines help unpack the content (what 
students need to know) and the level of performances (what students are 
expected to do with their knowledge) for each assessment. For instance, in 
terms of mathematics content, it is important to determine the kind of algebra 
being assessed — the proportion of algebra items that target prealgebra topics as 
opposed to topics typically addressed in Algebra I or Algebra II courses. In reading,
for example, we are concerned with how much of each test is dedicated to
assessing informational text as opposed to literary topics. 

In analyzing content, Achieve uses independently devised benchmarks,
particularly in estimating the grade level of particular content. In mathematics, we use
an international scale created as part of the Third International Mathematics 
and Science Study (TIMSS, which is now known as the Trends in International 
Mathematics and Science Study). In English, we use a scale adapted from a 
scale devised by ACT, Inc., to describe questions on its college preparatory and 
admissions tests. 

Judging the complexity of the performance or cognitive demand of each item is
as important as characterizing its content. In mathematics, for example,
students can be provided a formula and simply required to plug in appropriate
values, or they can be required to reason their way through a problem and solve it
by selecting the applicable formula from a chart that is provided. The cognitive
demand of reading items also can vary across a wide range: At one end of the
spectrum, students can be asked to apply a relatively low-level skill, such as
locating information in a text, while at the other end, they can be expected to
perform a far more intellectually demanding task, such as making
generalizations by synthesizing information across different passages.

Reading presents a unique situation since test questions usually are based on one
or more passages on the test, and both passages and questions can run the gamut
from easy and straightforward to difficult and complex. In the end, it is the
interplay of the items with the passages on which they are based that establishes the
rigor of a reading test. To address this critical dynamic, Achieve developed a
Reading Rigor Index (RRI), which is fully explained in the appendix.



Finally, Achieve’s exit exam analysis investigated what it takes for students
to pass each state test and how those expectations compare across states.
Achieve and experts from Michigan State University devised a statistical
approach to allow cut scores from different states’ tests to be compared.
Mathematics tests were compared on the TIMSS scale, and reading tests on
a scale adapted from ACT’s skills hierarchy scale. Using this methodology,
Achieve was able to identify those items that students who scored at the cut
score were likely to answer correctly and to determine the knowledge and
skills encompassed by those items. This procedure helped us show how
challenging each state test was to pass, relative to the other state tests
included in the study.

Since completing its original analysis of the six state exit exams, Achieve
has continued to refine the dimensions it uses to characterize mathematics
items, as well as reading items and passages. More information about the
methodology used in this analysis appears in the appendix.

II. How does Hawaii's policy regarding its high school assessment
compare with that of other states?



Hawaii and the six states that participated in Achieve’s study of graduation exams
have made different policy choices about the stakes they attach to the tests, the
timing of their tests, the schedule of implementation and the subjects tested.

First and foremost, Hawaii, unlike the other six states, does not require students to
pass its high school assessment in reading and mathematics to earn a high school
diploma. Hawaii, like Florida, Massachusetts and Ohio, gives its tests for the first
time to 10th graders. New Jersey and Texas give their exit exams in the 11th grade,
while Maryland has created end-of-course exams, with the English exam given as
early as the end of 9th grade. These states are also at different points in the rollout
of the assessments. In Florida, Massachusetts and New Jersey, the tests already
count for high school students, while in Maryland, Ohio and Texas, the tests will
count in the future. Finally, states also test different subject areas (see Table 1).



Table 1: State policy context for high school assessments

| | HAWAII | Florida | Maryland | Massachusetts | New Jersey | Ohio | Texas |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TEST | Hawaii Grade 10 Assessment in Reading and Mathematics | Florida Comprehensive Assessment Test | Maryland High School Assessments | Massachusetts Comprehensive Assessment System | High School Proficiency Assessment | Ohio Graduation Tests | Texas Assessment of Knowledge and Skills |
| GRADE FIRST GIVEN | 10th | 10th | End of Course | 10th | 11th | 10th | 11th |
| YEAR FIRST GIVEN | 2002 | 1998 | 2001 | 1998 | 2002 | 2003 | 2003 |
| REPLACED ANOTHER EXIT TEST | Yes | Yes | Yes | No | Yes | Yes | Yes |
| SUBJECTS TESTED | Reading; writing; mathematics | Reading; mathematics | English 1; algebra/data analysis; biology; government | English language arts; mathematics | English language arts/literacy; mathematics | Reading; writing; mathematics; science; social studies | English language arts; mathematics; science; social studies |
| FIRST GRADUATING CLASS REQUIRED TO PASS | Not required | 2003 | 2009 | 2003 | 2003 | 2007 | 2005 |
| OPPORTUNITIES FOR STUDENTS WHO HAVE NOT PASSED TO RETAKE TESTS | Not applicable | Yes | Yes | Yes | Yes | Yes | Yes |
| OTHER POLICIES RELATED TO STAKES | No tests are required for graduation. | Students are permitted to substitute results on SAT or ACT to meet graduation requirements. | Students can fail any one subject assessment and still meet requirements by earning a high combined score across all four assessments. | Appeals process uses statistical comparison of GPAs in subject area courses of passing and non-passing students. | State currently provides alternative, performance-based assessment given and scored locally, which will be phased out by 2011. | State law allows students to fail one of five tests and still graduate if score is close to passing mark and GPA in subject is at least 2.5. | Passing score for first two graduating classes was lower than eventual passing mark. |

III. How do Hawaii's tests compare with the graduation exams
from other states?



Reading 

TEST FEATURES 

The breakdown of questions, in terms of format, on Hawaii’s reading
assessment, compared to that of the states participating in Achieve’s exit exam
study, is shown in the table below.



Table 2: Distribution of items and points on reading section of seven state tests*

| State | Multiple-choice questions | Constructed-response questions | Multiple-choice points | Constructed-response points |
| --- | --- | --- | --- | --- |
| HAWAII | 40 | 5 | 40 | 12 |
| Florida | 44 | 8 | 44 | 27 |
| Maryland | 20 | 2* | 20 | 8 |
| Massachusetts | 34 | 4* | 34 | 16 |
| New Jersey | 20 | 4* | 20 | 16 |
| Ohio | 31 | 7 | 31 | 20 |
| Texas | 28 | 3* | 28 | 9 |

*In Achieve's 2004 study, four states (Maryland, Massachusetts, New Jersey and Texas) included the direct assessment of writing as part
of their exit exams and combined the reading and writing items into a single score. To ensure comparability in this study, we base our
comparison of Hawaii's reading assessment on only the reading portions of the six states' tests.



ITEM TYPE 

Hawaii allots a greater proportion of its reading assessment to constructed- 
response items than do most of the states in Achieve's study. 

Hawaii’s high school assessment in reading awards a higher percentage of its
points (38 percent) to constructed-response items than other states do on
average (32 percent) and than any other state does except Ohio (39
percent) (see Chart 1). The inclusion of a high proportion of
constructed-response items is a strength of Hawaii’s reading assessment, since the state
uses the format appropriately and to good advantage in assessing more
complex knowledge and skills. As indicated above, Maryland, Massachusetts,
New Jersey and Texas include a direct assessment of writing as part of their
graduation tests, while Hawaii administers a separate writing test. Since the
publication of Achieve’s initial study of graduation exams, Ohio has
introduced a separate writing test.

Chart 1: Percentage distribution of points by item type

[Bar chart showing, for Hawaii, Florida, Maryland, Massachusetts, New Jersey, Ohio, Texas and the 6-state average, the percentage of total points (0% to 100%) from multiple-choice and constructed-response items.]



CONTENT OF READING PASSAGES AND QUESTIONS 

Hawaii's reading test consists mainly of informational passages. 

Achieve’s ADP found that both college professors and employers stress the
necessity of high school graduates being able to comprehend a wide range of
informational materials, such as periodicals, memoranda, manuals, technical
reports, and intricate charts and graphs. States select reading passages from a
variety of genres in constructing their reading tests, and there is no set pattern
that states follow. The following genres are included on one or more of the tests
Achieve analyzed: literary text (short story/novel excerpt, poetry or drama),
literary non-fiction (essay, autobiography/biography or literary speech), exposition
(news story or textbook/informational article), procedural text/document
(directions or manual) and media (photograph or advertisement). None of the tests,
including Hawaii’s, contains passages with graphic elements, such as charts or
diagrams, and none contains passages that the National Assessment of
Educational Progress (NAEP) characterizes as “argumentation or persuasive.”

In sharp contrast to what postsecondary institutions and workplaces say is
required for success, the six states’ graduation tests that Achieve examined
dedicated most of their passages and most of their test points to literary text
and literary non-fiction. That is not the case with Hawaii: The state wisely
prioritizes understanding expository text. In awarding 73 percent of its test
points to items assessing students’ comprehension of exposition, Hawaii is
far above the average (14 percent) of the other six states. Moreover, Hawaii
is the only state to include a procedural (a “how to”) passage on its
assessment. Following instructions to perform specific tasks, answer questions or
solve problems, such as troubleshooting the failure of an appliance, is an
important skill for students to develop.



Chart 2: Distribution of points by reading passage genre

[Bar chart comparing Hawaii with the 6-state average across five genres (Literary Text, Literary Non-fiction, Exposition, Procedural Text and Media). Horizontal axis: percentage of total points, 0% to 80%.]

* Achieve has revised the genre classifications used in its 2004 study to more closely reflect those adopted by NAEP in its 2009 Reading
Framework. We include media, although NAEP does not, to characterize state choices of genre as accurately as possible.

Four of the six states in the exit exam study (Maryland, Massachusetts,
New Jersey and Texas) focused their tests on literary text and literary
non-fiction passages, as the distributions in Chart 3 indicate. Ohio and
Florida dedicated a significant proportion of their points to exposition and
procedural text, but not to the degree that Hawaii does.



Chart 3: Percentage of points within reading passage genres

[Bar chart. Axis: percentage of total points.]

NOTE: Totals may not equal 100 percent due to rounding and to the exclusion of a third genre group, Media and Graphics, which appears only
on the Maryland and Texas tests.



The 2009 NAEP Reading Framework reinforces what college professors and
employers advocate. It requires 70 percent of its reading passages to be
informational (a category that includes procedural text) and just 30 percent
of passages to be literary. The upward shift from the previous framework’s
requirement of 60 percent informational passages sends a clear signal to states
regarding the kinds of skills students will need for success in postsecondary
education and work. Hawaii’s reading test, in fact, exceeds NAEP’s stipulation by
more than 10 percent, while all other states in the exit exam study are well
below the guideline. With 80 percent of its reading passages devoted to
exposition and procedural text, Hawaii’s focus is promising, but the state
should continue to include narrative passages, often a dominant genre in
English classrooms, to target literary comprehension as well as the
understanding of informational text.



Chart 4: Genre of reading passages: Hawaii, NAEP and other states

[Chart comparing the genre of reading passages for Hawaii, NAEP and the other states.]

NOTE: Totals may not equal 100 percent due to rounding.

Hawaii's test questions emphasize comprehension of informational text,
whereas the other states' tests tend to focus on general comprehension, 
vocabulary and literary elements. 

While the process of selecting test passages and items is iterative, a state’s
choice of genres for its reading passages clearly influences the content of its
test questions. In view of Hawaii’s emphasis on expository text and its
inclusion of a procedural passage, it is not surprising that the state allocates the
majority of its test points to assessing students’ understanding of
informational content. Rather than concentrating on vocabulary and
comprehension in general, as the six states on average do, Hawaii’s test questions
zero in on proficiency in comprehending informational text in particular.
This finding is not characteristic of the other states. On average, the six
states’ assessments allot 46 percent of total test points to fundamental
reading comprehension topics (e.g., general comprehension of a word, phrase or
paragraph and understanding the main idea of a selection) and just 15
percent to understanding informational and persuasive topics. Hawaii’s
assessment follows a markedly different pattern in allotting just 18 percent of its
test points to general comprehension and vocabulary topics, while
dedicating 61 percent to comprehension of informational content.

Chart 5: Reading point distribution by item content




Hawaii’s awarding just 18 percent of its points to items that assess vocabulary 
and general comprehension differs markedly from the other states, where the 
range extends from 32 percent (Texas) to 61 percent (New Jersey). In keeping 
with its choice of expository text for the bulk of the reading passages on its 
assessment, Hawaii gives less weight to assessing literary elements (21 percent 
of test points) than five of the comparison states give; only Florida assigns even 
less weight (8 percent) to literary elements. In dedicating 61 percent of its 
points to assessing informational content, Hawaii towers above every state other 
than Florida. 



Table 3: Reading point distribution within item content categories

| Category | HI | FL | MD | MA | NJ | OH | TX |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Vocabulary and General Comprehension | 18% | 35% | 52% | 50% | 61% | 45% | 32% |
| Literary Elements | 21% | 8% | 46% | 50% | 33% | 31% | 65% |
| Information/Persuasive | 61% | 58% | 2% | 0% | 6% | 24% | 3% |
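
The six-state averages quoted in the text can be approximated from the rounded percentages in Table 3. The sketch below is an illustration only: the variable and function names are our own, and because the published table entries are already rounded, the recomputed averages (about 45.8 percent for vocabulary and general comprehension and 15.5 percent for informational and persuasive content) can differ slightly from the 46 percent and 15 percent reported above.

```python
# Percentage of reading points per item-content category, copied from Table 3.
# List order: FL, MD, MA, NJ, OH, TX (the six comparison states).
six_states = {
    "Vocabulary and General Comprehension": [35, 52, 50, 61, 45, 32],
    "Literary Elements": [8, 46, 50, 33, 31, 65],
    "Information/Persuasive": [58, 2, 0, 6, 24, 3],
}

# Hawaii's percentages from the same table.
hawaii = {
    "Vocabulary and General Comprehension": 18,
    "Literary Elements": 21,
    "Information/Persuasive": 61,
}


def mean(values):
    """Unweighted mean: each state counts equally, whatever its test length."""
    return sum(values) / len(values)


averages = {category: mean(values) for category, values in six_states.items()}

for category, avg in averages.items():
    gap = hawaii[category] - avg
    print(f"{category}: six-state average {avg:.1f}%, "
          f"Hawaii {hawaii[category]}% ({gap:+.1f} points)")
```

The unweighted mean treats each state equally regardless of how many points its test carries, which is an assumption about how the report's averages were formed.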





Hawaii’s emphasis on informational text is prescient and in general accord with 
both Achieve’s ADP findings and NAEP’s latest recommendations. Having made that 
statement, we offer a note of caution: ADP devotes one set of its English benchmarks
to literature, and the NAEP Reading Framework (2009) requires that 30 percent
of its passages be literary text (i.e., fiction, literary non-fiction and poetry). In
addition, Hawaii’s own high school standards in language arts label standards 1^ as 
“Reading and Literature” and include a number of performance indicators calling 
for students to read, understand and analyze literature. Hawaii should fine-tune
the balance of its passages and test questions to reflect the national guidelines
more closely and strengthen the alignment with its own standards.

COGNITIVE DEMAND OF READING PASSAGES AND QUESTIONS 

Hawaii's reading passages are less demanding than those found on the other
tests Achieve examined. The state's reading assessment does not include any
passages that represent the level of demand typical of instructional materials
written at a late high school level.

To judge the complexity of reading passages, Achieve’s reading experts created
a six-point scale describing texts from relatively simple to quite complex.
The levels are based on such characteristics as the specialization of
the vocabulary, the predictability of text structures or organization, the
complexity of the syntax, the level of abstractness, the familiarity of the
topic, and the number of concepts introduced in the passage. Level 1 represents
upper-elementary reading. Level 2 and Level 3 represent middle
school reading. Level 4 represents early-stage high school reading, and
Level 5 and Level 6 represent later-stage high school reading.

The average demand of Hawaii’s reading passages clusters at, and rarely
extends beyond, Level 3. The reading passages employed by Texas and
Maryland have an average demand at Level 4, and New Jersey’s demand
stretches toward Level 5.



Chart 6: Average reading passage demand 
A closer look at the breakdown of selected states’ passages by level of demand is
instructive. As noted earlier, Hawaii’s reading assessment most resembles Ohio’s and
Florida’s tests in the attention paid to informational text, as opposed to literary text.
However, Ohio’s and Florida’s tests contain reading passages with a higher level of
demand than Hawaii’s. The demand of Hawaii’s reading passages centers at Level 3,
while Ohio’s extends to Level 4 and Florida’s is centered at Level 4. The remaining
states’ passages tend to focus on the upper levels of the reading demand they
present to students. While it is fair to note that New Jersey’s and Texas’ tests are
administered in grade 11, one would expect to see more passages on Hawaii’s
assessment that target Level 4, and some passages that target Level 5 and Level 6.



Table 4: Reading passage demand distribution

Level   HAWAII   Florida   Ohio   6-state average
1       0%       0%        14%    5%
2       4%       0%        37%    10%
3       94%      37%       25%    29%
4       0%       63%       24%    33%
5       1%       0%        0%     12%
6       0%       0%        0%     11%

The questions on Hawaii's reading assessment are generally challenging, requiring 
students to go beyond the level of fundamental comprehension to making complex 
inferences and generating explanations. 

Since completing its analysis of state graduation tests in 2004, Achieve has
found that the cognitive demand of reading questions can be delineated more
finely by splitting the category of inference into two categories: low and high.
The resulting five categories are Literal Recall, Low Inference, High Inference,
Construct and Analyze.
Inference items require students to deduce a meaning that is not explicitly
stated in the text. Low inference items require students to make simple 
deductions — for example, identifying the main idea of an uncomplicated 
piece of text. High inference items are more cognitively taxing, requiring 
students to make more subtle deductions — for example, identifying the 
theme of a complex literary narrative. Achieve applied the distinction 
between low and high inference to the six states that participated in the 
original study, re-analyzing the 2004 data for the six states to obtain the 
results shown in Chart 7. 



Chart 7: Inference cognitive demand levels (percentage of total points)



One effect of refining the category of inference is that it helps reveal a spe- 
cial strength of Hawaii’s reading assessment. To be specific, in comparing 
the cognitive demand of Hawaii’s questions to each of the six states studied, 
we find that Hawaii allocates a greater proportion of its test points to the 
high inference category than do most of the states. 
However, to fully appreciate how the cognitive demand of the items on
Hawaii’s reading test compares with that of the other states, it is important
to examine the states’ overall distribution of item points in the three most
cognitively demanding categories: High Inference, Construct and Analyze.
Hawaii’s reading test allocates a total of 72 percent of item points to these
upper categories, while states, on average, allocate 64 percent. Only two
states (Texas at 86 percent and New Jersey at 75 percent) allocate a
greater percentage of test points to the most cognitively demanding categories,
and only New Jersey includes questions at the Analyze level.



Table 5: Distribution of points by low and high cognitive demand

Level of cognitive demand            FL    MD    MA    NJ    OH    TX    6-state average
Literal Recall and Low Inference     52%   42%   36%   25%   47%   14%   36%
High Inference, Construct, Analyze   48%   58%   64%   75%   53%   86%   64%

Hawaii’s lack of analytical questions may, in part, stem from the state’s having
fewer literary passages on its reading assessment than the other states
studied. Achieve has found that on large-scale state assessments, items calling
for analysis tend to address narrative reading passages more frequently
than informational reading passages. However, this need not be the case.
College preparatory tests, such as the ACT Assessment and SAT Reasoning
Test, include items at this higher level of cognitive demand for both narrative
and informational texts.




NOTE: Totals may not equal 100 percent due to rounding.




RIGOR OF THE ASSESSMENT 



As was true of the other states studied, most questions on Hawaii's reading 
test target skills meant to be taught and learned by grade 8 or 9. 

To gauge the approximate grade level of the content on the state exit exams
in English language arts, Achieve used an index based on one originally
created by ACT, Inc., to characterize its series of assessments. The index is
a composite scale that takes into account the content of a test, the cognitive
demand of reading passages and the cognitive demand of questions. ACT
established six levels to differentiate the knowledge and skills that are
measured on its reading assessments: Levels 1 through 4 cover knowledge
and skills found on ACT’s EXPLORE test, which is given in the 8th or 9th
grade; ACT’s PLAN test, which is given in the 10th grade, includes test
items from Level 1 through Level 5; and the ACT Assessment, which
students take in the 11th and 12th grades and which colleges use in admissions,
course placement and guidance decisions, incorporates items from
Level 1 through Level 6.

As is clear from Table 6, none of the states’ tests, including Hawaii’s,
approaches the level of demand that ACT says is characteristic of its
college admissions test. On the contrary, the vast majority of points (84 percent)
across the six states link to ACT Level 1 through Level 4, meaning the
level of demand across the six tests most closely resembles that of the ACT
EXPLORE test, which is given to students in 8th and 9th grades. This
finding holds true for Hawaii’s reading test as well; its profile reflects that of
ACT’s EXPLORE test.
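
The 84 percent figure can be checked against the state percentages reported in Table 6. The sketch below is illustrative only: it assumes the six-state figure is an unweighted mean of the state percentages, which the report does not state, and the names ACT_LEVEL_SHARES and share_through_level are invented for this example.

```python
# Percent of reading points at ACT Levels 1..6, transcribed from Table 6.
ACT_LEVEL_SHARES = {
    "HI": (10, 11, 28, 49, 1, 0),
    "FL": (10, 42, 19, 21, 8, 2),
    "MD": (24, 24, 20, 26, 6, 0),
    "MA": (7, 21, 11, 32, 29, 0),
    "NJ": (3, 22, 19, 25, 31, 0),
    "OH": (16, 51, 27, 6, 0, 0),
    "TX": (3, 11, 24, 38, 24, 0),
}


def share_through_level(shares, top_level):
    """Sum the percentage of points at Levels 1 through top_level."""
    return sum(shares[:top_level])


# Levels 1 through 4 are the levels covered by the EXPLORE test.
explore_range = {
    state: share_through_level(shares, 4)
    for state, shares in ACT_LEVEL_SHARES.items()
}

# Assumption: the six-state figure is a simple mean over the comparison states.
comparison_states = [state for state in ACT_LEVEL_SHARES if state != "HI"]
six_state_mean = sum(explore_range[s] for s in comparison_states) / len(comparison_states)

for state, share in explore_range.items():
    print(f"{state}: {share}% of points at Levels 1-4")
print(f"6-state mean: {six_state_mean:.1f}%")
```

Under that assumption the mean rounds to the 84 percent cited above, and Hawaii's own share at Levels 1 through 4 comes out higher still.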

A close look at the structure of Hawaii’s test shows it has a smaller percentage
of points at Level 5 and Level 6 (1 percent) than the average of the six
states (14 percent) but a significantly higher percentage of Level 4 items
than any of the six states. The latter finding stems mainly from the
fact that ACT assigns higher levels of demand to questions about expository
text and that genre is featured in Hawaii’s test. None of Hawaii’s questions,
however, qualify as Level 5 because the related reading passages, as indicated
by Hawaii’s profile on Achieve’s six-point scale of passage complexity,
lack the complexity required to generate more cognitively demanding
questions.
Chart 9: Content on reading tests on ACT scale, Hawaii versus six-state average




Table 6: Content on reading tests on ACT scale

Level  ACT EXPLORE     ACT PLAN      ACT Assessment    HI    FL    MD    MA    NJ    OH    TX    6-state
       (8th and 9th    (10th grade)  (11th and 12th                                              average
       grades)                       grades)
1      10-20%          5-15%         5-15%             10%   10%   24%   7%    3%    16%   3%    10%
2      20-30%                                          11%   42%   24%   21%   22%   51%   11%   29%
3      30-40%          20-30%        10-20%            28%   19%   20%   11%   19%   27%   24%   20%
4      15-25%          20-30%        20-30%            49%   21%   26%   32%   25%   6%    38%   25%
5      0%              25-35%        25-35%            1%    8%    6%    29%   31%   0%    24%   16%
6      0%              0%            20-30%            0%    2%    0%    0%    0%    0%    0%    0%

NOTE: Totals may not equal 100 percent due to rounding. 



As noted earlier, of the six comparison states, Hawaii’s assessment most 
closely resembles those of Ohio and Florida in the emphasis given to infor- 
mational text. However, Hawaii’s assessment ends up with a higher level of 
cognitive demand because it contains more challenging questions than do 
Ohio’s and Florida’s tests. This trait helps offset Hawaii’s relatively unde- 
manding reading passages. 
Chart 10: Content on reading tests on ACT scale, Hawaii versus Ohio and Florida

NOTE: Totals may not equal 100 percent due to rounding.

The overall rigor of Hawaii's reading test is below that of most of the other six 
states. 

The difficulty of a reading test is determined both by the complexity of the 
reading passages and by the cognitive demand of the questions about those 
passages. To capture this important interaction, Achieve developed a 
Reading Rigor Index (RRI) that combines the cognitive challenge level of an 
item with the difficulty level of the passage that the item targets. (Note: Cut
scores are not factored into the RRI. See appendix for more information on 
the RRI.) 

On this interactive scale, Hawaii, with an average index of 6.0, falls a bit 
below the six-state average of 6.5. The Hawaii, Florida and Maryland tests 
are roughly equivalent in terms of reading rigor, and all three tests are more 
rigorous than Ohio’s test. The New Jersey and Texas tests are the most rig- 
orous, followed by Massachusetts’ test. It is worth noting that the two most 
rigorous tests — Texas and New Jersey — are given in the 11th grade, 
whereas the rest are 10th grade tests, except for Maryland’s, which is an 
end-of-course test. 
Chart 11: Reading rigor levels

Hawaii structures its reading test quite differently than other states do on
average, as is evident when we compare the overall reading rigor profile of
Hawaii’s test with the average profile of the comparison states. Like the
other states’ tests, Hawaii’s shows an increase in percentage of questions
with higher demand in the interplay of its passages and questions at the
lower levels of the index, but then it abruptly tops out at an index of seven.
In fact, unlike the other states’ tests, 99 percent of Hawaii’s items fall between
Level 3 and Level 7 on the index. The remaining states on average show a
more symmetrical pattern of passage and question interaction. This is due
to the low level of demand of the reading passages on the Hawaii test, relative
to those on the other states’ tests.




Chart 12: Reading rigor profile, Hawaii versus six-state average 
SUMMARY OF FINDINGS

In some ways, the Hawaii reading assessment is on par with the six state
tests in Achieve’s original study. In terms of its approximate grade level (as
measured by the ACT scale), Hawaii fares similarly to the six comparison
states. All of the tests have a level of demand that most closely approximates
ACT’s EXPLORE test, which is given to students in grades 8 and 9.
And the Reading Rigor Index indicates that Hawaii’s test is about as cognitively
demanding as Maryland’s and Florida’s tests and is more demanding
than Ohio’s test (although it is less demanding than Texas’ and New Jersey’s
tests).

Nonetheless, the Hawaii State Assessment in reading is, overall, less
rigorous than the tests of the six states in Achieve’s original study of graduation
tests, due to its relatively undemanding reading passages. In the end, it is
the low demand of the passages and not the items themselves that reduces
the rigor of Hawaii’s test.
Mathematics



TEST FEATURES 

Hawaii’s test contains 77 items, more than any other test we examined. It 
is administered in four separate sections, totaling roughly three hours of 
actual testing time. This is slightly above the norm: Most other state tests 
we analyzed take two to two and a half hours. For the Hawaii test, Achieve
examined 49 core items — worth a total of 75 points — that contribute to a 
student’s “Meets Proficiency” level score. Of the 28 items that Achieve did 
not examine, seven were field-test items, and 21 were SAT-9 items used to 
provide a norm-referenced score for each student. The breakdown in terms 
of item format of Hawaii’s mathematics assessment, as compared to that of 
the participating states in Achieve’s study, is shown in the table below. 
Achieve’s constructed-response item category includes both short-response 
and extended-response items. It is also important to note that Hawaii pro- 
hibits the use of calculators on its state assessment, unlike the six states in 
Achieve’s graduation test study, all of which permitted students to use cal- 
culators for all or part of their tests. 



Table 7: Distribution of item points on mathematics assessments

State           Points by item format                         Total points   Total time            Testing time
HAWAII          33 multiple choice, 42 constructed response   75             4 hours, 58 minutes   2 hours, 58 minutes
Florida         28 multiple choice, 32 constructed response   60             -                     2 hours, 30 minutes
Maryland        26 multiple choice, 27 constructed response   53             3 hours               2 hours, 30 minutes
Massachusetts   32 multiple choice, 28 constructed response   60             untimed               2 hours (suggested)
New Jersey      30 multiple choice, 18 constructed response   48             2 hours, 26 minutes   2 hours
Ohio            32 multiple choice, 13 constructed response   45             -                     2 hours, 30 minutes
Texas           59 multiple choice, 1 constructed response    60             untimed               untimed
Hawaii's high school assessment in mathematics allots more points to items in
a constructed-response format than do other states' tests.

On average, the states Achieve studied allotted 36 percent of the points on
their graduation tests to items having a constructed-response format and 64
percent to those having a multiple-choice format. In contrast, Hawaii
devotes a significantly larger proportion of test points to constructed-response
items (56 percent) and a correspondingly smaller proportion to
multiple-choice items. Only two states in Achieve’s exit exam analysis
approach Hawaii’s distribution: Maryland assigns 51 percent and Florida
assigns 53 percent of their test points to constructed-response items.
Hawaii’s decision to emphasize constructed-response items is a strength of
its assessment. Generally speaking, the state used the item format as it is
meant to be used, that is, for measuring critical thinking and reasoning
skills and for solving multistep problems.
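
The percentages in this paragraph follow directly from the point counts in Table 7. The short sketch below reproduces them; it assumes the six-state average is an unweighted mean of the state percentages (the report does not say how the average was computed), and the names POINTS and constructed_response_share are illustrative.

```python
# (multiple-choice points, constructed-response points), transcribed from Table 7.
POINTS = {
    "Hawaii": (33, 42),
    "Florida": (28, 32),
    "Maryland": (26, 27),
    "Massachusetts": (32, 28),
    "New Jersey": (30, 18),
    "Ohio": (32, 13),
    "Texas": (59, 1),
}


def constructed_response_share(multiple_choice, constructed_response):
    """Return the percentage of total points earned on constructed-response items."""
    total = multiple_choice + constructed_response
    return 100.0 * constructed_response / total


shares = {state: constructed_response_share(*pts) for state, pts in POINTS.items()}

# Assumption: the six-state average is a simple mean of the state percentages.
six_states = [state for state in POINTS if state != "Hawaii"]
six_state_average = sum(shares[s] for s in six_states) / len(six_states)

for state, share in shares.items():
    print(f"{state:15s} {share:5.1f}%")
print(f"{'6-state average':15s} {six_state_average:5.1f}%")
```

Run as written, this gives 56 percent for Hawaii, about 51 percent for Maryland and 53 percent for Florida, and a six-state mean of roughly 36 percent, in line with the figures quoted above.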



Chart 13: Distribution of points by item type 
NOTE: Totals may not equal 100 percent due to rounding.

CONTENT OF ITEMS



Hawaii's assessment represents a good balance across the content domains; it 
gives more emphasis to algebra, geometry/measurement and data and less to 
number, as is appropriate for a high school test. 

To get a picture of the mathematics content state tests measure, Achieve cate- 
gorizes the distribution of test points according to the discipline’s four domains 
— number, algebra, geometry/measurement and data analysis. On average, the 
six states in our original study awarded the majority (66 percent) of the points 
students could earn to algebra and geometry/measurement (32 percent and 34 
percent respectively), followed by data (19 percent) and number (15 percent). 
Hawaii’s assessment follows a similar overall pattern in devoting 63 percent of 
the possible points to algebra and geometry/measurement. However, Hawaii 
awards slightly more points to algebra (39 percent) and slightly less to geometry/ 
measurement (24 percent) than the other states do on average. In comparison 
to the average of the other states, Hawaii also allocates a larger proportion of 
points to data (29 percent as compared with 19 percent) and a smaller propor- 
tion of points to number content (8 percent as compared with 15 percent). 



Chart 14: Distribution of points by content strand (Number, Algebra, Geometry/Measurement, Data)
Table 8: Distribution of points by strand

Strand                 HI    FL    MD    MA    NJ    OH    TX    6-state average
Number                 8%    25%   8%    13%   27%   11%   10%   16%
Algebra                39%   25%   42%   40%   25%   27%   50%   35%
Geometry/Measurement   24%   35%   6%    27%   23%   40%   33%   27%
Data                   29%   15%   45%   20%   25%   22%   7%    22%

NOTE: Totals may not equal 100 percent due to rounding. 
In comparing the distribution of points by strand for each state, we find that
Maryland’s and Texas’ tests stress algebra to an even greater degree than 
does Hawaii’s test, although Maryland’s emphasis on algebra is attributable 
to its being an end-of-course test for Algebra I and Data Analysis. It is also 
worth noting that Hawaii gives less emphasis to number (8 percent of its 
points) than any of the other states. Because number topics tend to be 
learned at earlier grade levels, Hawaii’s lack of emphasis on this strand adds 
to the overall rigor of the assessment. 

Hawaii is second only to Maryland in percentage of points devoted to the 
data strand, which is noteworthy given the importance of data in today’s 
world. Maryland’s emphasis is due to the fact that its test is specifically 
intended to assess algebra and data analysis, which is atypical of state grad- 
uation exams. 

Hawaii's test includes more advanced algebra topics than any other state test
in this study.

Because algebra is a prerequisite for success in credit-bearing college
mathematics courses and today’s high-skills workplace, Achieve closely examined
the specific algebra content being assessed on each state test. Across the six 
states that participated in our original study, we found that a majority of the 
algebra points students can earn are associated with less-demanding algebra 
topics. In fact, the six states, individually and on average, dedicated 50 per- 
cent or more of their algebra points to assessing prealgebra concepts that 
most students learn prior to high school. These include such basic skills as 
working with integers, rational numbers, patterns, representation, substitu- 
tion, basic manipulation and simplification. The six states, on average, 
assign less than one-third of the total of their algebra points to concepts 
such as linear equations and basic relations and functions that are typically 
associated with basic algebra or Algebra I — a course commonly taken in 
the 9th grade or in many cases even earlier. Moreover, the states allocate an 
even smaller proportion of the algebra points (10 percent on average) to 
assessing advanced algebra concepts, such as non-linear functions, equa- 
tions, inequalities, and work with real and complex numbers. These con- 
cepts are typically encountered in Algebra II and generally considered 
essential for success in credit-bearing college mathematics courses. 
Compared with the six states that participated in Achieve’s original study,
Hawaii assigns a significantly greater proportion of its items to assessing
advanced algebra concepts. In fact, a majority of Hawaii’s algebra points (57
percent) come from items that assess advanced algebraic understandings,
greatly exceeding the six-state average of just 10 percent. For example, the
Hawaii assessment includes items that extend beyond linear functions and
equations to include the non-linear, especially quadratics. Students also
are called on to solve systems of linear equations and to display their
understandings of real and complex number systems. These are also the kinds of
topics that distinguish grade 12 NAEP from grade 8 NAEP.



Chart 15: Distribution of algebra items by category 
Hawaii gives the same weight to two-dimensional geometry as the other
states do, on average. 

Achieve found that 54 percent of the geometry/measurement points on the 
state tests were associated with two-dimensional geometry and measure- 
ment, with the exception of New Jersey, which favored this area of study 
with 82 percent of its geometry points. Hawaii, on track with the other five 
states, allots 56 percent of its geometry points to two-dimensional geometry. 
States gave significantly less attention — 19 percent on average of geometry 
test points — to three-dimensional geometry that includes concepts such as 
volume and surface area. Geometry tends to be less hierarchical than algebra, 
so two-dimensional geometry is not necessarily less challenging than three- 
dimensional geometry. It is worth noting, however, that NAEP includes two- 
dimensional geometry and measurement on its 8th grade assessment, but it 
includes formal three-dimensional geometry on its 12th grade assessment, 
indicating that three-dimensional geometry is considered to be end-of-high 
school level content. Measurement, including such concepts as units and 
estimation of measurements, was not a major focus of any of the six gradua- 
tion tests Achieve analyzed, nor was trigonometry. Hawaii, however, along 
with Ohio and Texas, includes a few items that assess knowledge of basic 
right-triangle trigonometry. 



Table 9: Distribution of points by content: Geometry/measurement

Geometry area                             HI    FL    MA    NJ    OH    TX    5-state average*
Congruence, Similarity, Transformations   11%   33%   19%   18%   28%   15%   23%
2D Geometry and Measurement               56%   48%   44%   82%   50%   50%   55%
3D Geometry and Measurement               28%   14%   38%   0%    11%   30%   19%
Basic Measurement                         0%    5%    0%    0%    6%    0%    2%
Trigonometry                              6%    0%    0%    0%    6%    5%    2%

NOTE: Totals may not equal 100 percent due to rounding. 

*Because the Maryland exam focuses on Algebra and Data Analysis, it includes only three questions within the realm of 
geometry. These items constitute too small a sample to merit inclusion in this comparison. 
RIGOR OF THE ASSESSMENT



For the most part, Hawaii's assessment, like the other state tests Achieve 
examined, measures mathematics content that students in other countries 
study prior to high school. 

Because the performance of U.S. high school students in mathematics lags 
behind that of students in other industrialized countries, it is valuable to 
compare what is expected of students on these tests with expectations in 
other countries. In our exit exam study, Achieve had the advantage of look- 
ing at the mathematics exams through the lens of the International Grade 
Placement (IGP) index developed by Michigan State University as part of its 
ongoing work on the TIMSS. 

The IGP index represents an “average” or composite among 41 nations of 
the world (both high-performing and low-performing countries) as to the 
grade level in which a mathematics topic typically appears in the curricu- 
lum. For example, since decimals and fractions tend to be taught at the 4th 
grade level internationally, this topic has an IGP rating of four. Right-triangle 
trigonometry, on the other hand, tends to be taught in the 9th grade around 
the world, so it receives an IGP rating of nine. 
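
To make the idea of a test-level IGP concrete, the sketch below averages topic ratings over a set of items. It is a hypothetical illustration, not Achieve's or Michigan State's procedure: weighting each item by its score points is an assumption, the two-item test is invented, and only the two topic ratings (4 and 9) come from the text above.

```python
from typing import NamedTuple


class Item(NamedTuple):
    topic: str
    igp_rating: float  # grade at which the topic typically appears internationally
    points: int        # score points the item is worth


def average_igp(items):
    """Return the point-weighted mean IGP rating of a set of test items."""
    total_points = sum(item.points for item in items)
    if total_points == 0:
        raise ValueError("items must carry at least one point in total")
    return sum(item.igp_rating * item.points for item in items) / total_points


# Hypothetical two-item test; the point values are made up for illustration.
sample_items = [
    Item("decimals and fractions", 4, 2),
    Item("right-triangle trigonometry", 9, 3),
]
sample_average = average_igp(sample_items)
print(f"Average IGP of the sample test: {sample_average:.1f}")
```

Because the rating reflects content only, two tests with the same average IGP can still differ widely in the cognitive demand of their items.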

When applied to assessment items, the IGP describes content only. It is not 
intended to reflect cognitive or performance demands, nor item format — 
these are captured by another dimension of Achieve’s methodology. When
Achieve applied the IGP index to the six states’ exit exams, we found that 
the average of the content measured on the tests is at the 8th grade level 
internationally. In other words, the material on the exams the six states are 
using as a requirement for high school graduation is, on average, considered 
middle school content in most other countries. While there was some varia- 
tion across the states, no test had an average IGP rating higher than the 9th 
grade. The range of average IGP values across the six tests in Achieve’s
study extended from a low of 7.3 for Florida to a high of 8.8 for Maryland. 
When compared with these six states, Hawaii’s average IGP of 8.3 ranks sec- 
ond only to Maryland’s end-of-course Algebra test (8.8). 

Hawaii’s test attributes more points to advanced algebra concepts than does 
Maryland, and on that basis we would expect a higher IGP for Hawaii than 
Maryland. However, Maryland’s test is a test of Algebra and Data Analysis, 
and items having to do with data have a relatively high IGP rating because 
most countries teach data at higher grade levels than does the United 
States. This attribute of the IGP boosts Maryland’s average IGP relative to 
the other states. 
Chart 16: Average IGP (Hawaii, Florida, Maryland, Massachusetts, New Jersey, Ohio, Texas and six-state average; scale 2.0 to 10.0)



COGNITIVE DEMAND 

The majority of points on Hawaii's mathematics test are attributable to items
at the middle to lower end of the cognitive demand continuum.

The content measured by mathematics items tells an important part of the 
story, but a more complete understanding of what these tests measure 
requires an examination of the cognitive demand of the items as well. At 
issue is what students are actually required to do with the content. For 
example, does an item ask students to apply a routine procedure to a math- 
ematical problem, or is the item framed in such a way that it requires stu- 
dents to first develop a more complex mathematical model to solve the 
problem? The scale Achieve has devised to measure cognitive demand 
is designed to capture the processes that students employ as they “do” 
mathematics. 

In our original study, Achieve found that a majority of the points on the
tests across the six states were associated with items that require students 
to employ processes at the lower end of the cognitive continuum. On a five- 
point scale of rigor, with one being the least demanding performance and 
five being the most demanding, slightly more than half the points across the 
six tests Achieve studied were tied to the lowest two levels. The cognitive
demand profile of Hawaii’s test varies somewhat from the average of the 
other state tests. Hawaii’s test allots a greater proportion (12 percent) of its 
points to items that require recall (Level 1) than does any other state test, 
with the average being just 3 percent. But in its allocation of 49 percent of 
its test points to Level 2 items — items that require students to use routine 
procedures and tools to solve mathematics problems — Hawaii squares with 
other states’ average allocation of 48 percent of points. Similarly, Hawaii 
meets the six-state average of 26 percent allocation of test points to Level 3 
— using non-routine procedures. However, it falls out of line in terms of the 
points it awards to assessing advanced mathematical skills represented in 
Level 4 (formulating problems and strategizing solutions) and Level 5 
(advanced reasoning), allocating a total of 14 percent of item points to these 
upper levels as compared with the state average of 22 percent. 



Table 10: Distribution of points by level of cognitive demand

Cognitive demand level                               HI    FL    MD    MA    NJ    OH    TX    6-state average
1: Recall                                            12%   2%    4%    2%    8%    4%    2%    3%
2: Using Routine Procedures                          49%   50%   30%   53%   46%   53%   48%   48%
3: Using Non-Routine Procedures                      25%   33%   32%   22%   27%   27%   23%   26%
4: Formulating Problems and Strategizing Solutions   11%   8%    19%   8%    15%   16%   22%   14%
5: Advanced Reasoning                                3%    7%    15%   15%   4%    0%    5%    8%

NOTE: Totals may not equal 100 percent due to rounding.



Clustering items in three larger categories of low, medium and high demand 
helps reveal the underlying structure of Hawaii’s math test. As noted above, 
only 22 percent of the points across all of the tests are attributed to items 
that require more advanced mathematical skills (Level 4 and Level 5). 
However, Hawaii’s distribution of points is a less challenging one than any of 
the other states, placing the least emphasis on the highest levels of cognitive 
demand. Only 11 percent of Hawaii’s test points are attributable to Level 4
items that require students to formulate a problem, to strategize or to cri- 
tique a solution method. And only 3 percent of Hawaii’s points correspond 
to Level 5 items, which ask students to develop algorithms, generalizations, 
conjectures, justifications or proofs. 
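
The clustering described here can be reproduced from the level-by-level percentages in Table 10. The sketch below is one possible grouping, assumed for illustration: Levels 1 and 2 as the low band, Level 3 on its own, and Levels 4 and 5 as the upper band, matching how the text discusses them. The names LEVEL_SHARES and to_bands are invented for this example.

```python
# Percent of mathematics points at cognitive demand Levels 1..5, from Table 10.
LEVEL_SHARES = {
    "HI": (12, 49, 25, 11, 3),
    "FL": (2, 50, 33, 8, 7),
    "MD": (4, 30, 32, 19, 15),
    "MA": (2, 53, 22, 8, 15),
    "NJ": (8, 46, 27, 15, 4),
    "OH": (4, 53, 27, 16, 0),
    "TX": (2, 48, 23, 22, 5),
}


def to_bands(shares):
    """Collapse five levels into low (1-2), Level 3 and upper (4-5) bands."""
    level1, level2, level3, level4, level5 = shares
    return {"low": level1 + level2, "level3": level3, "upper": level4 + level5}


bands = {state: to_bands(shares) for state, shares in LEVEL_SHARES.items()}

# State placing the least weight on Levels 4 and 5.
least_upper = min(bands, key=lambda state: bands[state]["upper"])

# Assumption: the six-state figure is a simple mean over the comparison states.
comparison_states = [state for state in LEVEL_SHARES if state != "HI"]
upper_mean = sum(bands[s]["upper"] for s in comparison_states) / len(comparison_states)

for state, band in bands.items():
    print(f"{state}: low {band['low']}%, Level 3 {band['level3']}%, upper {band['upper']}%")
print(f"Least emphasis on Levels 4-5: {least_upper}")
print(f"6-state mean for Levels 4-5: {upper_mean:.1f}%")
```

With this grouping, Hawaii shows 61 percent of points in the low band and 14 percent in the upper band, the smallest upper-band share of the seven tests.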
Chart 17: Distribution of points by level of cognitive demand (percentage of total points by state; legend: Low (Levels 1 and 2), Middle (Levels 3 and 4), High (Levels 5 and 6))

NOTE: Totals may not equal 100 percent due to rounding.



Hawaii’s overall pattern of allotting a high proportion of test points to items
of low cognitive demand (recall and using routine procedures) holds true at
the strand level. Except for data, Hawaii’s cognitive demand for the major
strands is below the average of the other states, especially in the number
and geometry strands.
Chart 18: Average cognitive demand by strand, Hawaii versus six-state average (Number, Algebra, Geometry/Measurement, Data, Total)

SUMMARY OF FINDINGS

The content of Hawaii’s grade 10 assessment in mathematics is more rigor- 
ous than all but one of the states analyzed due to its relatively high propor- 
tion of advanced algebra. Hawaii’s test also is well balanced, giving more 
emphasis to geometry, algebra and data and less to number concepts, as is 
appropriate for a high school test. However, the relatively low cognitive 
demand of Hawaii’s test items in mathematics reduces the overall challenge 
of the test. This is offset in part by the state’s substantial allotment of item 
points to robust constructed-response items. Achieve’s analysis indicates
that Hawaiian students find items in this format to be more challenging to 
answer than multiple-choice items — even though they may not in fact 
involve higher-level cognitive skills. (See section, “What makes the Hawaii 
mathematics assessment so challenging for students?” for a discussion of 
the additional analysis of Hawaii’s mathematics test that Achieve conducted.) 

IV. How do the performance levels on Hawaii's grade 10
assessments in reading and mathematics compare with 
those of other states? 



The aim of a standards-based education system is for all students to acquire 
the knowledge and skills described by a state’s content standards. State 
assessments are the principal tool for measuring how well students have mas- 
tered that content. Up until this point, this report has focused on what is 
measured on Hawaii’s assessment in comparison to six states’ exit exams — 
the content, the difficulty level of the items and the complexity of the reading 
passages. However, students taking these tests are not required to answer all 
of the questions correctly to pass. States establish cut scores that students 
need to reach to pass the tests. These cut scores define the level of achieve- 
ment that students are ultimately held accountable for — they establish the 
floor of performance expected of high school graduates. As such, these scores 
represent the level of mastery that a state deems satisfactory. 

The Hawaii Department of Education asked Achieve to compare the “Meets 
Proficiency” level on its grade 10 assessments in reading and mathematics 
with the cut scores students had to reach to pass the equivalent tests in the 
six states that participated in Achieve’s 2004 study of graduation exams. 

Methodology 

Comparative studies of where states set their cut scores are rare and difficult 
to conduct. They typically involve comparing the percentage of students pass- 
ing each state’s test with the percentage of students passing a common test, 
such as NAEP. This approach permits judgments about the relative difficulty 
of different tests, but it does not provide information on the knowledge and 
skills students need to pass each test. 

Achieve, working with researchers from Michigan State University (MSU), 
developed a new procedure for comparing cut scores across state tests that 
focuses on the content of the test questions, thus giving states a broader com- 
parative picture of their expectations for students. The procedure was first 
used in Do Graduation Tests Measure Up? — published in June 2004 — and 
has been replicated for the analysis of Hawaii’s assessment. Because the items 
on Hawaii’s assessment and the six other state tests have been coded accord- 
ing to common metrics discussed previously in the report (e.g., content and 
cognitive demand), it is possible to use these metrics to identify what a typi- 
cal student passing the assessments is likely to know and be able to do. 

Performance Levels on the Reading Test 

Achieve compared cut scores across the English language arts tests using the 
ACT skills hierarchy. As stated earlier, ACT indicates that Level 1 through 
Level 3 are most heavily assessed on its EXPLORE test, which is given to 8th
and 9th graders. Its PLAN test, given to 10th graders, focuses most heavily on
Level 3 through Level 5 questions, while the college admissions exam (the
ACT Assessment) focuses on Level 4 through Level 6.

Given this frame, Achieve found that the average ACT skill level at the passing
score on the state exit exams in the original study ranged from 2.1 to 3.5.
Thus, students scoring at the passing level are, generally speaking, being
asked to perform below the level that ACT considers appropriate for 8th and
9th graders.

This finding holds true for Hawaii’s reading test. The average ACT skill level at
the “Meets Proficiency” score (303) on Hawaii’s reading assessment is 2.9,
exactly matching the average across the other six states. A comparison of
Hawaii’s rating with the other states suggests that scoring “Meets Proficiency”
on Hawaii’s reading assessment is about as challenging as passing Florida’s,
Massachusetts’ and Maryland’s exit exams, and considerably more challenging
than passing Ohio’s test. With average ACT skill levels of 3.5 and 3.2, respectively,
the New Jersey and Texas tests appear to be the most challenging ones to
pass among the seven tests, which is not surprising given the relatively high
level of content and cognitive demand in these tests. (Note: Item format is not
considered part of this scale.) It also is worth noting that New Jersey and Texas
administer their tests in the 11th grade, whereas most of the other states,
including Hawaii, administer their tests in 10th grade. The exception is
Maryland, whose test is an end-of-course test that is administered at the end of
the 9th grade.

Performance Levels on the Mathematics Test



As described earlier, Achieve used the IGP index to identify the average level of 
content measured on the tests. In our original study, we found that, on average, 
the tests from the six states measured mathematical content that tends to be 
focused at the 8th grade level internationally. The level of mathematics content 
knowledge students need to pass the state exit exams ranged from 7.1 to 8.6. 
That is, the questions on the tests that students scoring at the cut score are 
likely to get correct measure, on average, concepts that students around the 
world focus on in the 7th and 8th grades. 

The average IGP score at Hawaii’s “Meets Proficiency” cut score (300) is 8.3. 
Essentially, this means that to pass Hawaii’s grade 10 mathematics assessment, 
students are required to know mathematics content that is taught, on average, 
in early 8th grade internationally. Hawaii’s score of 8.3 also indicates that, in
terms of content difficulty, passing its mathematics assessment offers a challenge
similar to passing the tests in Ohio, Texas and Massachusetts. Hawaii’s assessment
exceeds the content demand of New Jersey and Florida. Only Maryland’s end- 
of-course test in Algebra and Data Analysis received a higher IGP score than 
Hawaii. Hawaii’s high score is likely due to the emphasis the mathematics 
assessment gives to advanced algebra. As noted previously, the average IGP 
score for Maryland is elevated somewhat by data topics placing relatively high 
on the IGP scale because internationally these topics fall later in the grade cur- 
riculum sequence. Again, it is content — not cognitive demand or item format 
— that is the basis for the IGP index. 

What makes the Hawaii mathematics assessment so challenging for students?



Hawaii is rightly concerned that large numbers of students are not passing the 
state mathematics assessment. The percentages of students who attained the
“Meets Proficiency” and “Exceeds Proficiency” cut scores in mathematics in
2005 were 18 percent and 2 percent, respectively. The passing rate, which
includes both categories, has been relatively stable: 19 percent of Hawaii students
passed the mathematics test in 2002, 18 percent passed in 2003, 21
percent passed in 2004 and 20 percent passed in 2005.

What is it that makes Hawaii’s mathematics assessment so challenging for students?
One important factor is that Hawaii’s test, as indicated by its average
IGP value, contains more demanding content than all but one state test. This
is due to the fact that Hawaii allots more points to advanced algebra topics
than do other states. In addition, Hawaii appropriately includes basic right-triangle
trigonometry, as do only two of the other states. Hawaii also includes
a substantial number of robust constructed-response items, which students
seem to find more difficult. (Achieve’s analysis of Hawaii’s test data in mathematics
revealed that students seem to have substantially less difficulty
answering multiple-choice items.)

Nonetheless, Achieve did not find the Hawaii test to be too challenging. As in
the other state tests Achieve analyzed, the majority of items on Hawaii’s test
assess content that students study prior to and early in high school. In addition,
Hawaii’s test items fall in the low range of overall cognitive demand, as
compared to the six states in Achieve’s study. The majority of Hawaii’s test
points are awarded to items that make minimal demands (“recall,” 12 percent,
or “use routine procedures,” 49 percent) or modest demands (“use
non-routine procedures,” 25 percent).

Are there other characteristics of the test that could account for the relatively
low passing rates that are not easily captured by the criteria in this study?
Achieve asked the Hawaii Department of Education to provide course-taking
patterns, meaning the percentage of students who had completed or were
enrolled in Algebra II at the time the test was administered in 2005 (i.e., in
the spring of grade 10). Since the Department of Education does not yet have
a data system in place capable of tracking the number of students enrolled in
Algebra II and in what grade, it asked registrars to make an educated guess.
Following are the grade-by-grade estimated percentages of students enrolled
in Algebra II that the Department of Education shared with Achieve: Grade 9
is 1 percent, grade 10 is 29 percent, grade 11 is 50 percent and grade 12 is 20
percent. Given these data, one possible reason for the low number of Hawaiian
high school students who score at the “Meets Proficiency” level on the grade
10 assessment is the contrast between the relatively high proportion of
advanced algebra content on the assessment and the relatively low proportion
of students who had been taught the content by the time they took the test. It
seems that the majority (70 percent) of Hawaii’s students are not enrolled in
Algebra II until grades 11 or 12, placing them at a significant disadvantage in
terms of responding to items based on advanced algebra content. It is also
important to keep in mind that even those students enrolled in Algebra II in
grade 10 are taking the assessment before they have completed the coursework,
as the test is administered in March or April. It is Achieve’s understanding
that the state already is acting to correct this preparation gap by revising
its high school standards and ensuring that the new assessment will be tightly
aligned to the revised standards.
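
The 70 percent figure follows directly from the registrars' grade-by-grade estimates. A minimal sketch of that arithmetic is given here. The variable names are illustrative, and the percentages are the estimates quoted above, not tracked enrollment data.

```python
# Estimated percentage of students enrolled in Algebra II, by grade,
# as shared by the Hawaii Department of Education (registrar estimates).
algebra2_enrollment_by_grade = {9: 1, 10: 29, 11: 50, 12: 20}

# The assessment is administered in the spring of grade 10.
TEST_GRADE = 10

# Students who had at least started Algebra II by the time of the test.
enrolled_by_test = sum(
    pct for grade, pct in algebra2_enrollment_by_grade.items()
    if grade <= TEST_GRADE
)

# Students who reach Algebra II only after the test has been given.
enrolled_after_test = sum(
    pct for grade, pct in algebra2_enrollment_by_grade.items()
    if grade > TEST_GRADE
)

print(enrolled_by_test, enrolled_after_test)  # 30 70
```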

Three other minor factors also may be contributing to low student performance:

1. A lack of motivation, as the test does not count for students;

2. A lack of sufficient exposure in coursework to constructed-response items,
which present a significantly higher level of challenge to Hawaii’s grade 10
students than do multiple-choice items; and

3. The fact that students are not allowed to use a calculator on any portion of
the test, in contrast to most other states.

There is growing evidence from other states that high school students take
standards and assessments more seriously when they know that their performance
on those tests counts. For example, only 48 percent of 10th graders
in Massachusetts passed the mathematics portion of the state’s new graduation
exam when it was first given in 1998. Some called for the state to lower
the bar or delay implementation, but instead state officials and local educators
redoubled their efforts to strengthen the curriculum and provide a variety
of academic supports. When the 10th graders from the class of 2003 took the
test (the first group that had to pass it to graduate), the scores jumped up
nearly 20 percentage points, suggesting that when it counts, students (and
schools) put forth more effort. By spring 2003, 95 percent of students in the
graduating class had passed the test.

A similar story played out in Virginia as it phased in new end-of-course
exams for high school graduation. Only 40 percent of students passed the
Algebra I exam when it was first given in 1998 (more students passed the
reading and writing tests). By 2003, 78 percent passed the Algebra I test,
and by the time the first class of high school seniors had to pass several of
the end-of-course tests to graduate in spring 2004, all but 1 percent earned
their diplomas.

The combination of low student motivation and the significant role that
constructed-response items play in Hawaii’s assessment may help explain 
the state’s low passing rate in mathematics. As noted previously, Hawaii’s 
mathematics assessment includes a significant number of constructed- 
response items (56 percent of total points), particularly when compared with 
other states (36 percent average across the other six states). Items in this 
kind of format tend to be challenging because they often require a substantial 
amount of reading and/or reasoning through multiple steps to solve a prob- 
lem. Perhaps most significantly, constructed-response items do not provide 
the crutch of offering a set of responses, one of which is correct and some- 
times verifiable by guessing and checking. Achieve verified that students, 
on average, got a lower percent of constructed-response items (30 percent) 
correct than multiple-choice items (43 percent). Constructed-response items 
are essential because they have the ability to measure more advanced skills 
and closely reflect the kind of tasks students will face in college courses and 
the workplace. However, if students are not used to routinely solving open- 
ended problems, their format can pose an additional challenge. 

It is possible that — because they know the assessment doesn’t count for 
graduation — students are not putting forth the necessary effort to complete 
constructed-response items. To test this hypothesis, MSU researchers looked 
at the student-response data in mathematics to see how students performed 
on each type of question. Their findings were revealing. Again, statistically 
significant differences were found by item type. On average, constructed- 
response items are more than twice as likely to be skipped by students (19 
percent) as multiple-choice items (8 percent). We cannot know the mindset 
of these students at the time they took the test, but it is conceivable that they 
simply were unmotivated to complete the more demanding items because the 
test does not count for them. 
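
The item-type comparison reported in this section can be restated as a small calculation. The sketch below uses only the four percentages given in the text. The dictionary layout and variable names are illustrative and do not reflect the format of the MSU researchers' data.

```python
# Average student results by item type on Hawaii's mathematics assessment,
# in percent, as reported in this section.
item_stats = {
    "constructed_response": {"pct_correct": 30, "pct_skipped": 19},
    "multiple_choice": {"pct_correct": 43, "pct_skipped": 8},
}

cr = item_stats["constructed_response"]
mc = item_stats["multiple_choice"]

# How many times more often constructed-response items are skipped.
skip_ratio = cr["pct_skipped"] / mc["pct_skipped"]

# Gap in percent correct, in percentage points.
correct_gap = mc["pct_correct"] - cr["pct_correct"]

print(f"skip ratio: {skip_ratio:.2f}, correct gap: {correct_gap} points")
```

A ratio above 2 is what the text describes as "more than twice as likely to be skipped."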

In revising its assessment, Hawaii will need to ensure that the new test is 
fully aligned with the new standards and thus fair to students. At the same 
time, Hawaii will want to ensure its grade 10 assessments are rigorous (i.e., 
include the more demanding content and performances delineated in the 
revised standards). Indeed, Hawaii should be careful not to reduce the 
overall rigor of its assessment, nor reduce the proportion of
constructed-response items, nor lower the cut score for “Meets Proficiency.” In fact,
Achieve’s American Diploma Project found that states need to educate their
high school graduates to achieve a far higher level of math proficiency than 
they currently are accomplishing, if students are to succeed in college and a 
workplace increasingly steeped in quantitative analysis. 

V. Conclusion



Achieve launched its original 2004 study to help answer some basic questions
about the expectations states are setting for their high school graduates
through the use of exit exams. Do the tests reflect material that students
should be familiar with by the time they complete high school? Is it reasonable
to expect all students to pass these tests before they graduate? If they
pass these tests, does it mean students are ready for their next steps in life?

Across the states, we found that the tests do indeed set a floor for students
that can be responsibly defended as a graduation requirement. However, they
do not effectively tap the higher-level skills that truly constitute “readiness”
for college and work. Generally speaking, our findings for Hawaii are consistent
with those of our previous study, with a few critical differences.

■ In reading, Hawaii’s assessment is somewhat less rigorous than other tests
in the study. Although the test includes challenging questions and emphasizes
informational text, the reading passages are not as rigorous as in other
states, which diminishes its overall rigor. The result is that the “Meets
Proficiency” level of performance sets a standard that is roughly comparable
to that of other states in Achieve’s study.

■ In mathematics, Hawaii’s assessment includes more rigorous algebra content
than the other tests we studied, making it a more effective measure of
the knowledge and skills that are important for college and work. At the
same time, however, the average cognitive demand of the items is relatively
low compared to that of other tests. The result is that Hawaii’s “Meets
Proficiency” level is not significantly different than that of four of the six
participating states in Achieve’s study and does not, in our opinion, set an
unreasonable standard for high school graduates.

Recommendations for Improvement 

As Achieve found in its original study of graduation exams in six states, the
Hawaii tests set a reasonable floor of expectation that should be raised over
time. As Hawaii moves forward, Achieve recommends that the state:

■ Raise the overall rigor of the grade 10 reading test. Hawaii should
increase the demand and complexity of the reading passages on its assessment.
Some passages on the reading test should represent the level of
demand typical of instructional materials written at a late high school reading
level to raise the ceiling on the test and signal the level of text students
need to comprehend to be on track for attainment in postsecondary education
and the new economy. In addition, Hawaii should add items that tap
the highest level of cognitive demand, requiring students to more deeply
analyze text.

■ Phase in higher cut scores on the reading test over time. In addition to
increasing the cognitive demand of its passages and/or items, Hawaii can 
raise the rigor of its reading test over time by raising the score required for 
passing. Texas is using this approach with its new graduation exam. This 
strategy works only if a test has enough range in what it measures so that 
a higher score actually reflects more advanced knowledge and skills. If a 
higher cut score simply means that students must answer more of the same 
kinds of items correctly, rather than items tapping more advanced concepts 
and skills, it is not very meaningful to raise the cut score. 

■ Raise the level of performance demand of the mathematics items. 
Although the content on the math test is challenging, the items themselves 
tend to make low-level demands in terms of performance. Hawaii should 
raise the cognitive demand of its items by increasing the proportion of 
items that require complex problem-solving skills, problem formulation and 
advanced reasoning. 

■ Build assessments of college and work readiness. While the Hawaii grade 
10 assessments set a reasonable floor for students, over time, Hawaii will 
need to go beyond its grade 10 test and develop a more comprehensive set 
of assessments that measure the full set of knowledge and skills that indi- 
cate readiness for college and work. 

In addition, Hawaii will need to work with its local districts to establish a 
systematic method for evaluating important skills such as research and oral 
communication that are not easily assessed on a paper-and-pencil test. 

Additional Recommendations 

As Hawaii raises its standards over time, it will face the simultaneous chal- 
lenge of raising student achievement. To help meet this challenge, particularly 
in mathematics, Hawaii should: 

■ Ensure the tests are aligned with the state standards and vertically aligned 
with each other, and ensure students are exposed to content prior to taking 
the assessment. 

■ Conduct analyses to determine as precisely as possible why the percentage 
of students scoring at or above the “Meets Proficiency” level on the grade 8 
test is low and provide targeted professional development to help teachers 
teach content and skills more effectively, including maximizing students’ 
exposure to constructed-response items. Shoring up middle school perform- 
ance will help ensure the larger majority of students will reach the “Meets 
Proficiency” level on the grade 10 test. 

■ Publish the state compendium of sample items, scoring rubrics and sample
student responses on the state Web site to more widely inform the public
of state expectations.

■ Identify and disseminate the most effective instructional materials available,
taking full advantage of the distribution power of the Internet.

Achieve's ADP Network

Achieve encourages Hawaii to consider joining its ADP Network, the group of
22 states that have pledged themselves to a policy agenda in support of
preparing students for success in college and work by the time they graduate
from high school. To close the expectations gap, the ADP Network states have
committed to the following four actions:

■ Aligning high school standards and assessments with the knowledge and
skills required for success after high school;

■ Requiring all high school graduates to take challenging courses that actually
prepare them for life after high school;

■ Streamlining the assessment system so that the tests students take in high
school also can serve as readiness tests for college and work; and

■ Holding high schools accountable for graduating students who are ready
for college or careers and holding postsecondary institutions accountable
for students’ success once enrolled.

Although the Network has been in existence for just over a year, Achieve has
already seen evidence of substantial progress on the part of participating
states. Hawaii should take advantage of the opportunity to join in partnership
with these other states so Hawaiian students will be ready to take on the challenges
of living and working in a global economy, equipped with the knowledge
and skills required for success.

Appendix: Summary of Methodology



To compare assessments, each assessment item was analyzed and coded on
the basis of distinguishing attributes to capture different characteristics of
individual test items and the tests as a whole. Many of the criteria in reading
and mathematics are similar, although there are important differences
that stem from the distinct natures of the disciplines. To ensure the reliability
of the data, at least two experts trained in the use of the criteria coded
each test. Those experts reconciled any differences in coding before the
data were further analyzed.

The following are summaries of the various criteria according to which
assessments in the study were analyzed.

Content of Items 

Mathematics 

To classify the content on state mathematics assessments, Achieve used the
Third International Mathematics and Science Study (TIMSS) Mathematics
Framework, adapted by the U.S. TIMSS National Research Center at Michigan
State University and Achieve experts. The framework provides a detailed,
comprehensive taxonomy of mathematics content, organized at its most general
levels according to the following major domains of mathematics:

■ Number 

■ Algebra 

■ Geometry/Measurement 

■ Data 

These domains are further broken down into smaller units to allow for finer-grained
comparisons. For example, geometry content is divided into a variety
of categories such as two-dimensional geometry and measurement;
three-dimensional geometry and measurement; transformations, congruence
and similarity; and trigonometry. The majority of these categories are
subdivided even further to facilitate a high degree of content specificity in
coding. Item coders for this study assigned up to three primary content
codes to each test item. In many cases, the multiple content codes aligned
with the same reporting category (e.g., geometry/measurement or algebra),
but this was not always the case. Items that aligned with more than one
reporting category were re-examined, and one primary code was identified.

Reading 

To identify the content on reading assessments for its original six-state
study (Do Graduation Tests Measure Up?), Achieve adapted a comprehensive
listing of the domain of reading, originally developed by the Council of
Chief State School Officers (CCSSO) in collaboration with several states for
its Survey of Enacted Curriculum. The list was intended to fully encompass
all topics addressed in reading classes from the primary to the secondary
level.

Based on this framework, Achieve developed a taxonomy that included all
the aspects of reading described in state standards (and therefore targeted
on state tests) to describe as accurately as possible the content or topic
that each item measured. The listing used in Achieve’s study has been
revised to more clearly reflect the topics addressed by test items at the secondary
levels. Because the list was originally developed to cover all grades
from kindergarten through grade 12, some codes that were irrelevant for
higher-level tests have been deleted. In addition, the listing has been reorganized
to clarify the relationship of the elements.

In Achieve’s original study, the major reporting categories for reading were
as follows:

■ Basic comprehension (includes word definitions, main idea, theme and
purpose)

■ Literary topics (includes figurative language, poetic techniques, plot and
character)

■ Informational topics (includes structure, evidence and technical elements)

■ Critical reading (includes appeals to authority, reason and emotion; validity
and significance of assertion or argument; style in relation to purpose; and
development and application of critical criteria)

In this previous grouping, some of the elements overlapped. To streamline
and clarify reporting, Achieve has regrouped some of the codes into more
discrete categories. For example, argument and assertion, formerly under
critical reading, are both aspects of persuasive texts and now are grouped
under informational/persuasive elements. Codes also have been realigned
into groupings that reflect all of the elements within them. For example, all
the literary elements have been grouped together: narrative elements
with the author’s craft elements. Additionally, some elements formerly
included in the critical reading category were deleted because they combined
references to both content and cognitive demand, such as determining
the validity of an assertion.

This revision yields four different major categories of codes for reading:

1. Vocabulary (includes word definitions)

2. General comprehension (includes purpose and main idea)

3. Literary elements (includes figurative language, plot and character,
theme, setting, and poetic language)

4. Informational/persuasive elements (includes organization and
structure, assertions, evidence and technical elements)

Approximate Grade-Level Demand of Items 

Mathematics 

To approximate the grade-level demand of mathematics items, Achieve used
the TIMSS International Grade Placement (IGP) index, developed by the
U.S. TIMSS National Research Center at Michigan State University. The IGP
index represents a kind of composite among the 40 TIMSS countries (other
than the United States) to show when the curriculum focuses on different
mathematics content: at what point the highest concentration of instruction
on a topic occurs. Using their nation’s content standards document,
education ministry officials and curriculum specialists in each TIMSS country
identified the grade level at which a mathematics topic is introduced
into the curriculum, focused on and completed. The IGP index is a weighted
average of those determinations. For example, a topic with an IGP of 8.7 is
typically covered internationally toward the end of 8th grade. The content
topics to which Achieve coded test items all have an IGP value associated
with them. For items that spanned more than one category and were subsequently
assigned a single code, the retained content code tended to be that
with the highest IGP value.

The following are examples of the IGP ratings of various mathematics topics.



CONTENT DESCRIPTION                            IGP INDEX

Whole Number: Operations                       2.5
Rounding and Significant Figures               4.7
Properties of Common and Decimal Fractions     5.6
Exponents, Roots and Radicals                  7.5
Complex Numbers and Their Properties           10.7
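
To make the use of IGP values concrete, the following sketch computes an average IGP for a small set of test items. The topic values come from the examples above. The three items, their point values, and the choice to weight by points are hypothetical assumptions for illustration. Treating "highest IGP wins" as a fixed rule is also a simplification, since the study says only that the retained code "tended to be" the one with the highest IGP value.

```python
# IGP values for the example topics listed in this appendix.
IGP_BY_TOPIC = {
    "Whole Number: Operations": 2.5,
    "Rounding and Significant Figures": 4.7,
    "Properties of Common and Decimal Fractions": 5.6,
    "Exponents, Roots and Radicals": 7.5,
    "Complex Numbers and Their Properties": 10.7,
}

# Hypothetical items: (candidate content codes, points the item is worth).
items = [
    (["Whole Number: Operations"], 1),
    (["Rounding and Significant Figures", "Exponents, Roots and Radicals"], 2),
    (["Complex Numbers and Their Properties"], 1),
]


def retained_igp(topics):
    """Assumed rule: keep the candidate code with the highest IGP value."""
    return max(IGP_BY_TOPIC[topic] for topic in topics)


def average_igp(test_items):
    """Points-weighted mean IGP across items (the weighting is an assumption)."""
    total_points = sum(points for _, points in test_items)
    weighted_sum = sum(retained_igp(topics) * points for topics, points in test_items)
    return weighted_sum / total_points


print(round(average_igp(items), 2))
```

An average near 7 would be read, on this scale, as content that is typically the focus of instruction internationally around 7th grade.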

Reading



To approximate the grade-level demand of reading items, Achieve adapted
the ACT Standards for Transition (for reading), which provide a hierarchy
of skills in these topic areas by taking into account the performance and
content of an item as well as the related demand of the reading passage.
ACT, Inc.’s Educational Planning and Assessment System encompasses
three assessments administered during 8th and 9th grades, 10th grade, and
11th and 12th grades. The Standards for Transition form the basis of all
three, with each successive test including more complex content and performances
from the standards. The standards are divided into six levels:

■ Levels 1 through 4 are assessed on the EXPLORE test (8th and 9th grades); 

■ Levels 1 through 5 are assessed on the PLAN test (10th grade); and 

■ Levels 1 through 6 are assessed on the ACT Assessment (11th and 12th 
grades). 



The following is an example of the most advanced three levels of one standard
from the Reading Standards for Transition.



STANDARD: COMPARATIVE RELATIONSHIPS

LEVEL 4: Have a sound grasp of relationships between people and ideas in uncomplicated passages. Identify clearly established relationships between characters and ideas in more challenging literary narratives.

LEVEL 5: Reveal an understanding of the dynamics between people and ideas in more challenging passages.

LEVEL 6: Make comparisons, conclusions and generalizations that reveal a feeling for the subtleties in relationships between people and ideas in virtually any passage.




Cognitive Demand of Items 




Mathematics 




Achieve developed a taxonomy of performance expectations (i.e., what 
students are expected to “do” with the mathematics content they know) 
based on a synthesis of the TIMSS Mathematics Framework and Achieve’s
assessment-to-standards alignment work with states. This taxonomy was 
categorized into five levels of cognitive demand of mathematics items. The 
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five levels deseribe the kind and eomplexity of performanee required of test- 
takers — ranging from simple reeall of information to eomplex reasoning 
skills. 

■ Level 1 includes demonstrating basic knowledge or recall of a fact or
property.

■ Level 2 includes routine problem solving that asks students to do such
things as compute, graph, measure or apply a mathematical transformation.

■ Level 3 includes estimating, comparing, classifying and using data
to answer a question or making decisions that go beyond a routine
problem-solving activity.

■ Level 4 includes formulating a problem, as well as strategizing or
critiquing a solution method.

■ Level 5 includes asking students to develop algorithms, generalizations,
conjectures, justifications or proofs.



Coders often assigned multiple performance codes to items. Sometimes the
primary performance codes for an item spanned two or more of the reporting
levels. In such cases, each item was re-examined, and a decision rule
was applied: the highest performance level category was accepted as
representing the performance expectation of that item.
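The decision rule can be illustrated with a minimal sketch. The function name and the list-of-integers input are assumptions made for this illustration; the report does not describe how the coding data were stored or processed.

```python
def reporting_level(primary_codes):
    """Return the reporting level for one mathematics item.

    primary_codes: the cognitive demand levels (1 to 5) that coders
    assigned to the item. Hypothetical representation, for illustration.
    """
    if not primary_codes:
        raise ValueError("an item needs at least one performance code")
    for level in primary_codes:
        if level not in (1, 2, 3, 4, 5):
            raise ValueError(f"level out of range: {level}")
    # Decision rule from the text: accept the highest level assigned.
    return max(primary_codes)


# An item coded at both Level 2 and Level 4 is reported at Level 4.
level = reporting_level([2, 4])
```

Taking the maximum means an item is never reported below the most demanding performance that coders identified in it.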



Reading 

In its original study of six states’ graduation exams, Achieve used a taxonomy
of performance expectations derived from CCSSO’s description of
performances in its Survey of Enacted Curriculum and influenced by Achieve’s
assessments-to-standards alignment protocol. This taxonomy was
categorized into four levels of cognitive demand of reading items. The four
levels provided information on the kind and complexity of reasoning required
of students, ranging from simple recall of information to complex reasoning
skills. The former categories were as follows:

■ Literal recall

■ Infer 

■ Explain 

■ Analyze 
Some revisions have been made to the former categories, informed by a
recent revision of Bloom’s taxonomy developed by Anderson and
Krathwohl in 2001. The revised taxonomy retains much of Bloom’s 1956
model. However, the original Bloom taxonomy combined both content and
performance, while the revised taxonomy separates the content from the
performance. Several of the original codes in the former Achieve cognitive
demand scale also included content topics as well as processes, making the
codes for some items redundant, such as “identifying main ideas or theme,”
where both main idea and theme are now coded with an appropriate
content code.

The revised scale retains some of the same headings as the original, with
some expansion that allows for better discrimination among the cognitive
processes typically assessed in reading tests:

■ Recall (includes locating and recognizing)

■ Low inference (includes paraphrasing and generalizing)

■ High inference (includes concluding, comparing and illustrating)

■ Construct (includes summarizing and explaining)

■ Analyze (includes discriminating and outlining)

■ Evaluating (includes critiquing)

■ Creating (includes designing and hypothesizing)

Categories six (evaluating) and seven (creating) are not characteristic of
items found on large-scale, on-demand state tests and are, therefore, not
included in the related data charts.

Demand of Reading Passages 

Achieve analyzed the difficulty level of each reading passage according to a
six-point scale ranging from straightforward text to more complex,
challenging and abstract text. This scale was developed by noted reading
experts who reviewed various characteristics of passages, such as level or
specialization of vocabulary, predictability of structures or organization,
complexity of syntax, level of abstractness, familiarity of the topic and the
number of concepts introduced in the passage. Generally speaking, Level 1
represents upper-elementary reading levels. Level 2 and Level 3 represent
middle school-level reading. Level 4 represents early-stage high school
reading, and Level 5 and Level 6 represent late-stage high school reading.
Categories for consideration of reading passage difficulty include:



■ Structure

• Narration

• Description

• Explanation

• Instruction

• Argumentation

■ Vocabulary

• Poetic

• Idiomatic

• Technical

• Unusual/unfamiliar

■ Syntax/connectives

• Dialogue

• Sentence structure

■ Characters/ideas

■ Narrator/stance

■ Theme/message/moral

■ Literary effects

• Foreshadowing

• Flashback

• Irony

■ Familiarity

• Topic

• Place

• Time period



Reading Rigor Index 

The Reading Rigor Index (RRI) is a method of determining how the
cognitive demand of an item interacts with the level of a reading passage.
This interaction of cognitive demand level and reading level contributes to
the challenge of an item. For example, an item could require a low level of
performance on a difficult passage, a high level of performance on an easy
passage, a high level of performance on a difficult passage or a low level of
performance on an easy passage. The RRI score is obtained by adding the
cognitive demand level and the reading demand level for each reading item
on a test. The Cognitive Demand Scale ranges from a low of one to a high of
five and the Reading Level Demand Scale from a low of one to a high of six,
allowing for a total of 10 possible scores (from two to 11) that items can
achieve on the RRI.
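The arithmetic behind the item-level RRI score can be sketched briefly. The function and constant names are assumptions introduced for illustration; only the two scale ranges and the addition rule come from the description above.

```python
# Scale ranges as described in the text.
COGNITIVE_LEVELS = range(1, 6)   # Cognitive Demand Scale: 1 to 5
READING_LEVELS = range(1, 7)     # Reading Level Demand Scale: 1 to 6


def rri_score(cognitive_demand, reading_demand):
    """RRI score of one item: cognitive demand plus reading demand."""
    if cognitive_demand not in COGNITIVE_LEVELS:
        raise ValueError("cognitive demand must be between 1 and 5")
    if reading_demand not in READING_LEVELS:
        raise ValueError("reading demand must be between 1 and 6")
    return cognitive_demand + reading_demand


# Enumerating every combination confirms the 10 distinct scores, 2 to 11.
possible_scores = sorted(
    {rri_score(c, r) for c in COGNITIVE_LEVELS for r in READING_LEVELS}
)
```

Because the two levels are simply added, different combinations can yield the same score: a Level 4 performance on a Level 1 passage and a Level 1 performance on a Level 4 passage both score five.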

The average RRI score for each test is calculated by weighting each item
according to its point value and averaging the result. For example, an
item based on a Level 3 reading passage with a cognitive demand of two
would have an RRI score of five. If the item were worth two points, as in
a constructed-response item, the item would be given double weight in
calculating the average RRI level of the test.
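A point-weighted average of this kind can be sketched as follows. The tuple layout and the three sample items are hypothetical and are not drawn from any state's test.

```python
def average_rri(items):
    """Point-weighted average RRI for a test.

    items: iterable of (cognitive_demand, reading_demand, points) tuples.
    This layout is an assumption made for the illustration.
    """
    items = list(items)
    total_points = sum(points for _, _, points in items)
    if total_points <= 0:
        raise ValueError("the test must carry at least one point")
    # Each item's RRI score counts once per point the item is worth.
    weighted_sum = sum((cog + read) * points for cog, read, points in items)
    return weighted_sum / total_points


# Hypothetical three-item test; the first item is the two-point example
# from the text (Level 3 passage, cognitive demand of two, RRI of five).
sample_items = [(2, 3, 2), (1, 2, 1), (4, 5, 1)]
test_average = average_rri(sample_items)  # (5*2 + 3*1 + 9*1) / 4 = 5.5
```

Weighting by points treats a two-point item the same as two one-point items that have the same RRI score.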
Cut Scores



Each state determines the levels of proficiency its students must reach to
pass the state’s exit exam based on scaled scores. The difficulty in
comparing performance levels and the cut scores that reveal these levels is
that these scaled scores are unique to each state’s exam and students.
Without a comparison sample (giving different state exams to the same group
of students or giving a common exam to students in all six states), no
connections among these scaled score distributions exist. Consequently,
aside from a subjective analysis of proficiency-level setting procedures, it
has been impossible to determine objectively if the proficiency levels set by
different states have similar meaning.

Achieve, working with researchers from Michigan State University,
developed a procedure to establish comparability of proficiency levels across
states according to the different dimensions by which the assessments
analyzed in this study have been coded. Because the assessments from the six
states in the original study were coded item by item according to common
metrics, it became possible to compare what passing the assessments exactly
at the cut score would mean, state to state. Achieve chose, in this study, to
look at the mathematics cut scores through the lens of the IGP index and the
English language arts cut scores through the ACT index (both are described
above).

States almost universally use Item Response Theory (IRT) models to scale
assessment items and to estimate a scaled value for each student. The cut
score is established in this metric. Consequently, the cut scores (the scores
needed simply to pass, not reach any level of greater proficiency) and
scaling information provided by the states were used to determine sets of
correctly answered items (or passing “scenarios”) that allow students to
reach the cut score and the likelihood that those scenarios would occur.
When coupled with the IGP (for mathematics) or ACT (for English language
arts) codings of the items, the process transforms the cut scores into the
corresponding IGP or ACT metrics. Comparisons of states’ cut scores are
done in these metrics. Because of the large number of potential passing
scenarios (2^n, where n is the number of items or points on the test), only a
random sample of 20,000 passing scenarios was used for the computation.
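The report does not give the computational details of this procedure, so the following is only a rough sketch of how sampled passing scenarios could be turned into a value on a coded metric. It rests on several assumptions that are not from the source: a Rasch (one-parameter) IRT model, a scenario defined as a response pattern whose raw score equals the cut exactly, a scenario's value taken as the mean code of its correctly answered items, and invented item difficulties and codes.

```python
import math
import random


def p_correct(theta, difficulty):
    """Rasch probability of a correct answer (model choice is an assumption)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def cut_score_in_metric(items, theta_cut, raw_cut, n_samples=20000, seed=0):
    """Estimate the cut score on a coded metric by sampling scenarios.

    items: list of (difficulty, code) pairs, where code is the item's
    value on the metric (an IGP or ACT coding, for example). Hypothetical.
    theta_cut: ability of a student located exactly at the cut score.
    raw_cut: number of correct answers that just reaches the cut.
    """
    if not 1 <= raw_cut <= len(items):
        raise ValueError("raw_cut must be between 1 and the number of items")
    rng = random.Random(seed)
    probs = [p_correct(theta_cut, difficulty) for difficulty, _ in items]
    codes = [code for _, code in items]
    total, accepted, attempts = 0.0, 0, 0
    max_attempts = 1000 * n_samples  # guard against very rare scenarios
    while accepted < n_samples and attempts < max_attempts:
        attempts += 1
        # Draw one response pattern; likelier patterns are drawn more often.
        correct = [rng.random() < p for p in probs]
        if sum(correct) != raw_cut:
            continue  # not a scenario that passes exactly at the cut
        total += sum(c for c, ok in zip(codes, correct) if ok) / raw_cut
        accepted += 1
    if accepted == 0:
        raise ValueError("no passing scenario was sampled")
    return total / accepted


# Invented five-item test: (difficulty, code) pairs for illustration only.
example_items = [(-1.0, 6.0), (-0.5, 7.0), (0.0, 8.0), (0.5, 8.5), (1.0, 10.0)]
estimate = cut_score_in_metric(example_items, theta_cut=0.0, raw_cut=3)
```

Drawing response patterns from the model, instead of enumerating all 2^n of them, weights each scenario by how likely it is to occur, which is one plausible reading of why a random sample suffices.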